 back propagation



Differential Informed Auto-Encoder

Zhang, Jinrui

arXiv.org Artificial Intelligence

If the physics formula is given in the form of differential equations, a physics-informed neural network can be built to solve it numerically on a global scale [5, PINN]. This process can be seen as a decoder that takes a sample point in the domain of the partial differential equations and solves the equations to obtain the corresponding output for each input point. If only a small, random amount of training data is available, we need to recover the differential relationships in the data in order to re-sample from the domain. This process can be viewed as an encoder that encodes the inner structure of the original data.
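A minimal sketch of the decoder view, assuming the toy ODE u'(x) = -u(x) with exact solution exp(-x); the residual below is the quantity a PINN would minimize over sampled collocation points. The function names are illustrative, not the paper's:

```python
import numpy as np

# Physics-informed residual for the ODE u'(x) = -u(x), u(0) = 1,
# whose exact solution is u(x) = exp(-x).  A PINN would minimize this
# residual over collocation points; here we only evaluate it for two
# candidate functions via central finite differences.

def pinn_residual(u, xs, h=1e-5):
    """Mean squared ODE residual |u'(x) + u(x)|^2 at collocation points."""
    du = (u(xs + h) - u(xs - h)) / (2 * h)   # central finite difference
    return np.mean((du + u(xs)) ** 2)

xs = np.linspace(0.0, 2.0, 50)               # collocation points ("decoder" inputs)
exact = lambda x: np.exp(-x)                 # true solution: near-zero residual
wrong = lambda x: 1.0 - x                    # poor candidate: large residual

print(pinn_residual(exact, xs))
print(pinn_residual(wrong, xs))
```

The exact solution gives a residual at the level of finite-difference error, while the poor candidate is penalized everywhere in the domain.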


The Unified Balance Theory of Second-Moment Exponential Scaling Optimizers in Visual Tasks

Zhang, Gongyue, Liu, Honghai

arXiv.org Artificial Intelligence

Existing first-order optimizers fall mainly into two branches: classical optimizers represented by Stochastic Gradient Descent (SGD) and adaptive optimizers represented by Adam, along with their many derivatives. The debate over the merits and demerits of these two types of optimizers has persisted for a decade. In practical experience, SGD is generally considered more suitable for tasks such as Computer Vision (CV), while adaptive optimizers are widely used in tasks with sparse gradients, such as Large Language Models (LLMs). Although adaptive optimizers almost always offer faster convergence, they can lead to overfitting in some cases, resulting in poorer generalization than SGD on certain tasks. Even in Large Language Models, Adam continues to face challenges, and its original strategy may not always have an advantage once improvements such as gradient clipping are introduced. With such a wide variety of optimization methods available, it is essential to introduce a unified, interpretable theory. This paper works within the framework of first-order optimizers and, through the intervention of the balance theory, proposes for the first time a unified strategy that integrates all first-order optimization methods.
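For reference, the two update rules the abstract contrasts can be sketched as follows (my own illustration of textbook SGD and Adam, not the paper's unified strategy):

```python
import numpy as np

# One SGD step: move against the raw gradient.
def sgd_step(w, g, lr=0.1):
    return w - lr * g

# One Adam step: scale the step per coordinate by running estimates of
# the first moment (mean) and second moment (uncentered variance).
def adam_step(w, g, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * g            # first-moment estimate
    v = b2 * v + (1 - b2) * g * g        # second-moment estimate
    m_hat = m / (1 - b1 ** t)            # bias correction
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# One step on f(w) = w^2 (gradient g = 2w), starting from w = 1.0:
w0, g = 1.0, 2.0
print(sgd_step(w0, g))                   # 0.8
w1, m, v = adam_step(w0, g, m=0.0, v=0.0, t=1)
print(w1)                                # ~0.9: the step size is ~lr per coordinate
```

The contrast is visible even in one step: SGD's step scales with the gradient magnitude, while Adam's bias-corrected normalization makes the step roughly `lr` regardless of that magnitude, which is what helps with sparse gradients.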


End-to-end Learning of LDA by Mirror-Descent Back Propagation over a Deep Architecture

Neural Information Processing Systems

We develop a fully discriminative learning approach for the supervised Latent Dirichlet Allocation (LDA) model using Back Propagation (i.e., BP-sLDA), which maximizes the posterior probability of the prediction variable given the input document. Different from traditional variational learning or Gibbs sampling approaches, the proposed learning method applies (i) the mirror descent algorithm for maximum a posteriori inference and (ii) back propagation over a deep architecture together with stochastic gradient/mirror descent for model parameter estimation, leading to scalable and end-to-end discriminative learning of the model. As a byproduct, we also apply this technique to develop a new learning method for the traditional unsupervised LDA model (i.e., BP-LDA). Experimental results on three real-world regression and classification tasks show that the proposed methods significantly outperform previous supervised topic models and neural networks, and are on par with deep neural networks.
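The mirror-descent building block mentioned here can be sketched in a few lines. This is a hedged illustration of an entropic (exponentiated-gradient) step, which keeps a probability vector, such as a document's topic proportions, on the simplex; it is not the paper's full BP-sLDA inference:

```python
import numpy as np

def mirror_descent_step(theta, grad, lr=0.5):
    """One entropic mirror-descent step on the probability simplex."""
    theta = theta * np.exp(-lr * grad)   # multiplicative (exponentiated-gradient) update
    return theta / theta.sum()           # renormalize back onto the simplex

theta = np.array([0.25, 0.25, 0.5])      # e.g. topic proportions of a document
grad = np.array([1.0, 0.0, -1.0])        # gradient of some loss w.r.t. theta
theta = mirror_descent_step(theta, grad)
print(theta, theta.sum())                # still a valid probability distribution
```

Unlike an additive gradient step followed by projection, the multiplicative form never leaves the simplex, which is why it composes cleanly into the layers of a deep unrolled architecture.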


Convergence Acceleration of Markov Chain Monte Carlo-based Gradient Descent by Deep Unfolding

Hagiwara, Ryo, Takabe, Satoshi

arXiv.org Machine Learning

The proposed solver is based on the Ohzeki method that combines Markov chain Monte Carlo (MCMC) and gradient descent, and its step sizes are trained by minimizing a loss function. In the training process, we propose a sampling-based gradient estimation that substitutes auto-differentiation with a variance estimation, thereby circumventing the failure of back propagation due to the non-differentiability of MCMC. The numerical results for a few COPs demonstrate that the proposed solver significantly accelerates convergence compared with the original Ohzeki method. Combinatorial optimization problems (COPs) comprising discrete variables are considered hard to solve exactly in polynomial time, which relates to the well-known P vs. NP problem. Along with deterministic approximation algorithms, samplers such as Markov chain Monte Carlo have been applied to COPs. However, the convergence time for obtaining reasonable approximate solutions is long.
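The general trick of estimating a gradient from samples instead of differentiating through the sampler can be sketched with a Gibbs distribution. For p_beta(x) ∝ exp(-beta·E(x)), the identity d/dbeta E[f(x)] = -Cov(f(x), E(x)) gives the gradient from samples alone. This is loosely analogous to the variance-based estimator the abstract describes, not the paper's exact method:

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_wrt_beta(samples, f, E):
    """Sample estimate of d/dbeta E[f(x)] = -Cov(f(x), E(x)) under a Gibbs law."""
    fx, Ex = f(samples), E(samples)
    return -np.mean((fx - fx.mean()) * (Ex - Ex.mean()))

# Toy discrete energy over states {0, 1, 2}; draw exact Gibbs samples.
energies = np.array([0.0, 1.0, 2.0])
beta = 1.0
p = np.exp(-beta * energies)
p /= p.sum()
samples = rng.choice(3, size=200_000, p=p)

g = grad_wrt_beta(samples, f=lambda s: energies[s], E=lambda s: energies[s])
print(g)   # with f = E this is -Var(E): negative, since raising beta lowers expected energy
```

Nothing here is differentiated through the sampling step, which is the point: the same estimate works when the samples come from a non-differentiable MCMC chain.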


A Novel Method for improving accuracy in neural network by reinstating traditional back propagation technique

R, Gokulprasath

arXiv.org Artificial Intelligence

Deep learning has revolutionized the field of artificial intelligence by enabling machines to learn complex patterns and perform tasks that were previously deemed impossible. However, training deep neural networks is a challenging and computationally expensive task that requires optimizing millions or even billions of parameters. The back propagation algorithm has been the go-to method for training deep neural networks for decades [5], but it suffers from limitations such as slow convergence and the vanishing gradient problem. To overcome these limitations, several alternative training methods have been proposed, such as standard back propagation and Direct Feedback Alignment. The core idea of this approach is to update the weights and biases in each layer of a neural network using the local error at that layer, rather than back propagating the error from the output layer to the input layer [2]. By doing so, the training process can be accelerated and the model's accuracy can be improved.
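The layer-local idea can be sketched in the spirit of Direct Feedback Alignment: the hidden layer is updated from the output error routed through a fixed random matrix B rather than through the transpose of the forward weights. All names, sizes, and data below are my own illustration, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

X = rng.normal(size=(8, 4))                    # toy inputs
y = (X.sum(axis=1, keepdims=True) > 0) * 1.0   # toy binary targets
W1 = rng.normal(0, 0.5, (4, 6))
W2 = rng.normal(0, 0.5, (6, 1))
B = rng.normal(0, 0.5, (1, 6))                 # fixed random feedback path
lr, losses = 0.5, []

for _ in range(300):
    h = sigmoid(X @ W1)
    out = sigmoid(h @ W2)
    e = out - y
    losses.append(float(np.mean(e ** 2)))
    d2 = e * out * (1 - out)                   # local delta at the output layer
    d1 = (e @ B) * h * (1 - h)                 # hidden delta via fixed B, not W2.T
    W2 -= lr * h.T @ d2
    W1 -= lr * X.T @ d1

print(losses[0], losses[-1])                   # compare initial and final loss
```

The hidden-layer update never waits on, or reuses, the forward weights of the layer above, which is what makes such schemes parallelizable across layers.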


Generalization of Back propagation to Recurrent and Higher Order Neural Networks

Neural Information Processing Systems

The propagation of activation in these networks is determined by dissipative differential equations. The error signal is backpropagated by integrating an associated differential equation. The method is introduced by applying it to the recurrent generalization of the feedforward backpropagation network. It is then extended to the case of higher-order networks and to a constrained dynamical system for training a content-addressable memory. The essential feature of the adaptive algorithms is that the adaptive equation has a simple outer-product form.
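The core idea, backpropagating the error by integrating an associated (adjoint) differential equation, can be sketched on a toy system. Below, dx/dt = -p·x with loss L = 0.5·x(T)², and the gradient dL/dp obtained from the adjoint ODE is checked against a finite difference. This is an illustration of the general adjoint technique, not the paper's algorithm:

```python
import numpy as np

def forward(p, x0=1.0, T=1.0, n=2000):
    """Forward Euler integration of dx/dt = -p*x, storing the trajectory."""
    dt, x, xs = T / n, x0, [x0]
    for _ in range(n):
        x += dt * (-p * x)
        xs.append(x)
    return xs, dt

def loss(p):
    xs, _ = forward(p)
    return 0.5 * xs[-1] ** 2

def adjoint_grad(p):
    xs, dt = forward(p)
    lam, g = xs[-1], 0.0                # lam(T) = dL/dx(T) = x(T)
    for k in range(len(xs) - 1, 0, -1): # integrate the adjoint backward in time
        g += dt * lam * (-xs[k])        # accumulate lam * df/dp, with df/dp = -x
        lam -= dt * p * lam             # d lam/dt = -lam * df/dx = p * lam
    return g

p = 0.7
g = adjoint_grad(p)
fd = (loss(p + 1e-5) - loss(p - 1e-5)) / 2e-5
print(g, fd)                            # both approximate -T * x0^2 * exp(-2*p*T)
```

The backward sweep plays the role of backpropagation: the adjoint variable carries the output error backward through the dynamics, and the gradient accumulates as a simple product of adjoint and state, mirroring the outer-product form noted in the abstract.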


Induction of Multiscale Temporal Structure

Neural Information Processing Systems

Learning structure in temporally extended sequences is a difficult computational problem because only a fraction of the relevant information is available at any instant. Although variants of back propagation can in principle be used to find structure in sequences, in practice they are not sufficiently powerful to discover arbitrary contingencies, especially those spanning long temporal intervals or involving high-order statistics. For example, in designing a connectionist network for music composition, we have encountered the problem that the net is able to learn musical structure that occurs locally in time, e.g., relations among notes within a musical phrase, but not structure that occurs over longer time periods, e.g., relations among phrases. To address this problem, we require a means of constructing a reduced description of the sequence that makes global aspects more explicit or more readily detectable. I propose to achieve this using hidden units that operate with different time constants.
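A hidden unit with time constant tau acts as a leaky integrator, so slow units retain a reduced description of the distant past while fast units track local structure. A minimal sketch of that contrast (an illustration of the mechanism, not the paper's network):

```python
import numpy as np

def leaky_trace(x, tau):
    """Leaky integrator: h[t] = h[t-1] + (x[t] - h[t-1]) / tau."""
    h, out = 0.0, []
    for v in x:
        h += (v - h) / tau
        out.append(h)
    return np.array(out)

x = np.zeros(100)
x[10] = 1.0                              # a single input event
fast = leaky_trace(x, tau=2.0)           # forgets the event within a few steps
slow = leaky_trace(x, tau=50.0)          # still carries a trace 50 steps later
print(fast[60], slow[60])
```

Fifty steps after the event, the fast unit's trace is essentially zero while the slow unit still responds, which is exactly the property that makes longer-range contingencies detectable.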


The deep learning project which led me to burnout

#artificialintelligence

In this article, I will present the deep learning project that I wanted to carry out, then the techniques and approach I used to tackle it. I will end the article with some meaningful reflections that I hope will help some of you. I wanted to build a smartphone app that can recognize flowers from a picture. Basically, the app is split into two parts; the front-end part is essentially the mobile development. I wanted to build a deep learning model from scratch, without a deep learning framework, to help me understand the inner workings of image classification (I know it sounds crazy).


Back Propagation. Backpropagation is a popular algorithm…

#artificialintelligence

Backpropagation is a popular algorithm used for training neural networks. Here, X is the input data, y is the corresponding output data, hidden_layer_size is the number of neurons in the hidden layer, learning_rate is the learning rate, and num_iterations is the number of iterations to train the model for. The sigmoid() function computes the sigmoid activation function. First, we define the sigmoid activation function, which takes an input value x and returns the output of the sigmoid function. Next, we define the derivative of the sigmoid function, which takes an input value x and returns the derivative of the sigmoid with respect to x.
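The excerpt describes code that is not shown. Below is a minimal reconstruction under the stated assumptions: one hidden layer, sigmoid activations throughout, and the parameter names taken from the text; the network shapes and the XOR usage example are my own additions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1 - s)

def train(X, y, hidden_layer_size, learning_rate, num_iterations, seed=0):
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 1, (X.shape[1], hidden_layer_size))
    W2 = rng.normal(0, 1, (hidden_layer_size, y.shape[1]))
    for _ in range(num_iterations):
        # forward pass
        z1 = X @ W1; a1 = sigmoid(z1)
        z2 = a1 @ W2; a2 = sigmoid(z2)
        # backward pass: propagate the error from the output to the hidden layer
        d2 = (a2 - y) * sigmoid_derivative(z2)
        d1 = (d2 @ W2.T) * sigmoid_derivative(z1)
        W2 -= learning_rate * a1.T @ d2
        W1 -= learning_rate * X.T @ d1
    return W1, W2

# Usage example: learn XOR.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
W1, W2 = train(X, y, hidden_layer_size=4, learning_rate=1.0, num_iterations=5000)
pred = sigmoid(sigmoid(X @ W1) @ W2)
print(np.round(pred, 2).ravel())
```

Biases are omitted here for brevity; a faithful implementation would add a bias vector per layer and update it from the same deltas.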